RidgeRun Metadata/Streaming Protocols/RTSP

Introduction

Real-Time Streaming Protocol (RTSP) is an application-layer protocol for controlling media streaming sessions. It does not carry media data itself. Instead, clients issue requests such as SETUP, PLAY, PAUSE, and TEARDOWN (seeking is done by sending PLAY with a Range header), while the actual media is typically delivered via RTP.
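
To make the control-protocol idea concrete, the following is a minimal sketch of what an RTSP/1.0 request looks like on the wire: a request line, a CSeq header, and an empty line. The helper name rtsp_request and the example URL are assumptions used only for illustration; they are not part of any RidgeRun tool.

```shell
#!/bin/sh
# rtsp_request METHOD URL CSEQ
# Hypothetical helper: prints a minimal RTSP/1.0 request
# (request line, CSeq header, terminating blank line).
rtsp_request() {
    method="$1"; url="$2"; cseq="$3"
    printf '%s %s RTSP/1.0\r\nCSeq: %s\r\n\r\n' "$method" "$url" "$cseq"
}

# Example: ask the server which methods it supports.
# The URL is an assumed local test address.
rtsp_request OPTIONS rtsp://127.0.0.1:5005/stream1 1
```

A real client continues with DESCRIBE, SETUP, and PLAY, and returns the session ID it receives from SETUP in a Session header on later requests.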

Key Characteristics

  • Control protocol: Manages streams, but doesn't transport media data.
  • Session-based: Each session uses a unique ID to manage state.
  • Multi-stream support: Handles video, audio, and metadata streams in parallel.
  • Remote control: Enables client-side playback control.
  • Time sync: Uses RTP timestamps and RTCP for synchronization.
  • Flexible transport: RTP can be sent over UDP, or interleaved over the RTSP TCP connection (firewall-friendly).

Benefits and Limitations

Benefits

  • Precise control: Enables low-latency, frame-accurate playback—ideal for interactive applications.
  • Multi-stream sessions: Supports video, audio, and metadata in a single session.
  • Low latency: Well-suited for real-time use cases (e.g., robotics, security, teleconferencing).
  • Broad support: Compatible with many devices, software, and streaming servers.

Limitations

  • No native encryption: Plain RTSP and RTP are sent unencrypted, and authentication is limited to HTTP-style Basic/Digest schemes. Use RTSPS (RTSP over TLS) or SRTP for secure streaming.
  • NAT/firewall issues: UDP may be blocked unless TCP interleaving is used.
  • Session complexity: Requires handling of states, timing, and RTCP feedback.
  • Persistent connection: Depends on a continuous client-server session.

RidgeRun compatible products

The following RidgeRun products can be used with the RTSP protocol:

  • SEI (H.264 / H.265): Supported via seiinject → rtph264pay/rtph265pay; SEI metadata is embedded in the video bitstream and transported over RTP/RTSP.
  • MPEG-TS: Supported via mpegtsmux → rtpmp2tpay; metadata is included in the TS and encapsulated in RTP for RTSP transport.

Usage Implications (RTSP + Metadata)

RTSP + SEI

Description: Embedding metadata as SEI NAL units in H.264/H.265 streams works seamlessly with RTP and RTSP. SEI metadata travels in-band within the video bitstream.

Benefits

  • Fully compatible with rtph264pay / rtph265pay and RTSP.
  • No protocol changes: Works without modifying RTP or RTSP.
  • Simple extraction: Use seiextract on the receiver side.

Limitations

  • Requires custom parsing: Receivers must explicitly extract and interpret SEI metadata.
  • SEI may be stripped: Re-encoding or intermediate processing may discard SEI metadata unless explicitly preserved.

RTSP + MPEG-TS

Description: MPEG-TS allows metadata to be encapsulated in RTP using rtpmp2tpay, enabling delivery over RTSP.

Benefits

  • Flexible metadata support: Supports SCTE-35, custom PID streams, private descriptors, and KLV.
  • Synchronous or asynchronous transport: Metadata can be aligned with video frames or sent independently, using timestamps for synchronization.
  • RTSP-compatible: Integrates seamlessly with gst-rtsp-server, rtspsink, and related infrastructure.
  • Native support: Some commercial players can extract standardized metadata (e.g., KLV) from MPEG-TS streams without requiring custom software.
  • RidgeRun extensions: Enables sync/async KLV metadata transport using dedicated pads in mpegtsmux and tsdemux, aligned with MISB standards.

Limitations

  • Higher overhead: TS encapsulation adds more per-packet overhead than payloading the elementary stream directly in RTP.
  • Metadata extraction: Requires TS demuxing on the receiver side.
  • Loose frame association: Since metadata is transmitted in separate PES streams, it is not inherently tied to video frames. Synchronization relies on timestamps.
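
The overhead noted for MPEG-TS can be estimated with simple arithmetic. An MPEG-TS packet is 188 bytes, of which at least 4 bytes are header. The sketch below assumes 7 TS packets per RTP payload, a common choice because 7 × 188 = 1316 bytes fits within a typical 1500-byte Ethernet MTU. The helper names are hypothetical.

```shell
#!/bin/sh
TS_PACKET_SIZE=188   # bytes per MPEG-TS packet (fixed by the format)
TS_HEADER_SIZE=4     # minimum header bytes per TS packet

# ts_payload_bytes N: size of an RTP payload carrying N TS packets
ts_payload_bytes() { echo $(( $1 * TS_PACKET_SIZE )); }

# ts_header_bytes N: minimum TS header bytes inside that payload
ts_header_bytes() { echo $(( $1 * TS_HEADER_SIZE )); }

ts_payload_bytes 7   # prints 1316
ts_header_bytes 7    # prints 28
```

The 28 bytes are a lower bound: PES headers, adaptation fields, and the PAT/PMT tables add further overhead that depends on the stream.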

Examples

This section presents reference pipelines that demonstrate various combinations of metadata transmission and reception using RidgeRun’s products over the RTSP protocol. Each example typically includes a sender, a receiver, and the corresponding output, showcasing the complete end-to-end flow. The pipelines highlight different metadata injection and extraction methods, such as SEI and MPEG-TS.

In the following examples, we will be using rtspsink, which is a GStreamer element developed by RidgeRun based on the GStreamer RTSP server. This element enables straightforward RTP-over-RTSP streaming directly from the command line, making it especially useful for local testing of encoders, metadata injectors, and transport formats such as MPEG-TS or H.264.

For more information on how to obtain, build, evaluate, and explore additional examples using rtspsink, please refer to the official wiki at: GStreamer rtspsink element – RidgeRun Developer Wiki


SEI

  • Sender
gst-launch-1.0 -v \
  videotestsrc is-live=true ! \
  x264enc tune=zerolatency key-int-max=30 bitrate=2500 ! \
  h264parse config-interval=-1 ! video/x-h264,stream-format=byte-stream,alignment=au ! \
  seimetatimestamp ! \
  seiinject metadata="Hello World" ! \
  mpegtsmux name=mux ! \
  capsfilter caps="video/mpegts, mapping=stream" ! \
  rtspsink service=12345
  • Receiver
GST_DEBUG=seiinject:5,seiextract:5 \
gst-launch-1.0 -v rtspsrc location=rtsp://${IP_ADDRESS}:${PORT}/${MAPPING} ! \
  rtpmp2tdepay ! tsdemux ! h264parse ! seiextract ! fakesink silent=false
  • Output
0:00:10.321609855 128170 0x74ec3001b920 WARN              seiextract gstseiextract.c:425:gst_sei_extract_prepare_output_buffer:<seiextract0> Identify nalu operation unsuccessful
/GstPipeline:pipeline0/GstFakeSink:fakesink0: last-message = chain   ******* (fakesink0:sink) (6245 bytes, dts: 0:00:09.999983296, pts: 0:00:09.999983296, duration: 0:00:00.033333333, offset: 1847970, offset_end: -1, flags: 00002600 marker header delta-unit , meta: GstSeiMeta) 0x74ec18023900
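
The SEI receiver pipeline leaves IP_ADDRESS, PORT, and MAPPING unset. A minimal sketch of values that match the SEI sender is shown below, assuming both pipelines run on the same host. The port comes from service=12345 and the mapping from mapping=stream in the sender; that the mapping appears as the URL path is inferred from the MPEG-TS example, where mapping=stream1 pairs with rtsp://127.0.0.1:5005/stream1. The rtsp_url helper is hypothetical.

```shell
#!/bin/sh
IP_ADDRESS=127.0.0.1   # assumption: sender runs on the same host
PORT=12345             # from "rtspsink service=12345" in the sender
MAPPING=stream         # from "mapping=stream" in the sender caps

# rtsp_url HOST PORT MAPPING: hypothetical helper that assembles the URL
rtsp_url() { printf 'rtsp://%s:%s/%s\n' "$1" "$2" "$3"; }

rtsp_url "$IP_ADDRESS" "$PORT" "$MAPPING"   # prints rtsp://127.0.0.1:12345/stream
```

With these variables set in the same shell, the receiver command can be run unchanged.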

MPEG-TS

  • Sender
gst-launch-1.0 -e \
  metasrc period=1 metadata="hello" ! 'meta/x-klv' ! \
  mpegtsmux name=mux alignment=7 ! \
  capsfilter caps="video/mpegts, mapping=stream1" ! \
  rtspsink service=5005 async-handling=true \
  videotestsrc is-live=true ! video/x-raw,format=I420,width=320,height=240,framerate=30/1 ! \
  x264enc tune=zerolatency key-int-max=60 byte-stream=true ! \
  h264parse config-interval=-1 ! video/x-h264,stream-format=byte-stream,alignment=au ! mux.
  • Receiver
gst-launch-1.0 -v \
  rtspsrc location=rtsp://127.0.0.1:5005/stream1 latency=0 ! \
  rtpmp2tdepay ! tsdemux name=demux \
  demux. ! queue ! h264parse ! avdec_h264 ! videoconvert ! fpsdisplaysink text-overlay=false sync=false \
  demux. ! queue ! 'meta/x-klv' ! fakesink dump=true async=true sync=false
  • Output
00000000 (0x7f346c035f80): 68 65 6c 6c 6f 00                                hello.          
00000000 (0x7f346c035f60): 68 65 6c 6c 6f 00                                hello.          
00000000 (0x7f346c0b37a0): 68 65 6c 6c 6f 00                                hello.
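
Each dumped buffer holds the bytes 68 65 6c 6c 6f 00, which is the string set in metasrc followed by a NUL terminator. As a quick check, the sketch below decodes such a hex sequence back into text; hex_to_text is a hypothetical helper, not a GStreamer or RidgeRun tool.

```shell
#!/bin/sh
# hex_to_text "HH HH ...": hypothetical helper that converts
# space-separated hex bytes into text.
# NUL bytes (00) are skipped because shell strings cannot hold them.
hex_to_text() {
    for byte in $1; do
        [ "$byte" = "00" ] && continue
        # Turn the hex byte into a 3-digit octal escape and print it.
        printf "\\$(printf '%03o' "0x$byte")"
    done
    printf '\n'
}

hex_to_text "68 65 6c 6c 6f 00"   # prints hello
```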
